Computer Vision¶

  • Purpose: Create a user-friendly manual for beginners or enthusiasts to learn computer vision with Python tools like OpenCV and Pillow.

  • Components: Include technical explanations, step-by-step tutorials, and full project implementations for both image and video processing.

  • Prepared by:

    • Ali Abedini
    • Shayan Kashefi
    • Parsa Naseri

1. OpenCV Basics¶

Example: Displaying an Image

From the OpenCV Docs:

In [3]:
import cv2
from google.colab.patches import cv2_imshow

# Load an image
image = cv2.imread('example.jpg')

# Display the image
cv2_imshow(image)

# Wait for a key press and close the window
cv2.waitKey(0)
cv2.destroyAllWindows()
No description has been provided for this image

Example: Contour Detection

In [4]:
import cv2

image = cv2.imread('example.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 50, 150)
contours, _ = cv2.findContours(edges, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

for contour in contours:
    cv2.drawContours(image, [contour], -1, (0, 255, 0), 2)

cv2_imshow(image)
cv2.waitKey(0)
cv2.destroyAllWindows()
No description has been provided for this image

Example: Real-Time Face Detection From Learn OpenCV:

In [20]:
!pip install opencv-python-headless
!pip install opencv-python
Requirement already satisfied: opencv-python-headless in /usr/local/lib/python3.11/dist-packages (4.10.0.84)
Requirement already satisfied: numpy>=1.21.2 in /usr/local/lib/python3.11/dist-packages (from opencv-python-headless) (1.26.4)
Requirement already satisfied: opencv-python in /usr/local/lib/python3.11/dist-packages (4.10.0.84)
Requirement already satisfied: numpy>=1.21.2 in /usr/local/lib/python3.11/dist-packages (from opencv-python) (1.26.4)
In [21]:
from google.colab import files
import cv2

# Upload the video file
uploaded = files.upload()

# Use the uploaded file (e.g., 'video.mp4')
video_path = list(uploaded.keys())[0]

# Load the pre-trained Haar Cascade classifier
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

# Start video capture from the uploaded file
cap = cv2.VideoCapture(video_path)

# Get video properties
frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = int(cap.get(cv2.CAP_PROP_FPS))

# Define the codec and create VideoWriter object
output_path = 'output.avi'
fourcc = cv2.VideoWriter_fourcc(*'XVID')  # Codec for AVI format
out = cv2.VideoWriter(output_path, fourcc, fps, (frame_width, frame_height))

while True:
    ret, frame = cap.read()
    if not ret:
        break

    # Convert frame to grayscale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Detect faces
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5, minSize=(30, 30))

    # Draw rectangles around detected faces
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 0, 0), 2)

    # Write the frame to the output video
    out.write(frame)

# Release resources
cap.release()
out.release()

# Download the output video
files.download(output_path)
Upload widget is only available when the cell has been executed in the current browser session. Please rerun this cell to enable.
Saving Einstein_1.mp4 to Einstein_1.mp4

2. Pillow Basics¶

Example: Opening, Rotating, and Saving an Image

From the Pillow Docs:

In [22]:
from PIL import Image

# Open an image file
image = Image.open('example.jpg')

# Rotate the image
rotated_image = image.rotate(45)

# Save the rotated image
rotated_image.save('rotated_example.jpg')
rotated_image.show()

Example: Creating a Thumbnail

In [23]:
from PIL import Image

image = Image.open('example.jpg')
image.thumbnail((100, 100))
image.save('thumbnail.jpg')

3. Project: Image Editor¶

Combining OpenCV and Pillow

Goal: A script for resizing and applying a filter to an image.

In [24]:
import cv2
from PIL import Image, ImageEnhance

# Load image using OpenCV
image = cv2.imread('example.jpg')

# Resize using OpenCV
resized_image = cv2.resize(image, (300, 300))

# Convert to Pillow Image for enhancements
pillow_image = Image.fromarray(cv2.cvtColor(resized_image, cv2.COLOR_BGR2RGB))

# Apply brightness enhancement
enhancer = ImageEnhance.Brightness(pillow_image)
brightened_image = enhancer.enhance(1.5)

# Save the final image
brightened_image.save('edited_example.jpg')
brightened_image.show()

4. Advanced Computer Vision¶

Object Detection with Pre-trained YOLO Model

From PyImageSearch:

In [25]:
import cv2
import numpy as np

# Load YOLO model
net = cv2.dnn.readNet("yolov3.weights", "yolov3.cfg")
layer_names = net.getLayerNames()
output_layers = [layer_names[i - 1] for i in net.getUnconnectedOutLayers().flatten()]

# Load image
image = cv2.imread("example.jpg")
height, width, channels = image.shape

# Prepare image for YOLO
blob = cv2.dnn.blobFromImage(image, 0.00392, (416, 416), (0, 0, 0), True, crop=False)
net.setInput(blob)
outputs = net.forward(output_layers)

# Process detections
for output in outputs:
    for detection in output:
        scores = detection[5:]
        class_id = np.argmax(scores)
        confidence = scores[class_id]
        if confidence > 0.5:
            center_x = int(detection[0] * width)
            center_y = int(detection[1] * height)
            w = int(detection[2] * width)
            h = int(detection[3] * height)

            # Draw bounding box
            x = int(center_x - w / 2)
            y = int(center_y - h / 2)
            cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)

cv2_imshow(image)
cv2.waitKey(0)
cv2.destroyAllWindows()
No description has been provided for this image

5. Machine Learning in Computer Vision¶

MNIST Digit Classification Using OpenCV

From Coursera:

In [19]:
import cv2
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Load MNIST dataset
from sklearn.datasets import fetch_openml
mnist = fetch_openml('mnist_784')

X, y = mnist.data, mnist.target.astype(np.int)
X = np.array([cv2.resize(img.reshape(28, 28), (14, 14)).flatten() for img in X])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train logistic regression
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

# Evaluate
print("Accuracy:", clf.score(X_test, y_test))
---------------------------------------------------------------------------
TimeoutError                              Traceback (most recent call last)
/usr/lib/python3.11/urllib/request.py in do_open(self, http_class, req, **http_conn_args)
   1347             try:
-> 1348                 h.request(req.get_method(), req.selector, req.data, headers,
   1349                           encode_chunked=req.has_header('Transfer-encoding'))

/usr/lib/python3.11/http/client.py in request(self, method, url, body, headers, encode_chunked)
   1302         """Send a complete request to the server."""
-> 1303         self._send_request(method, url, body, headers, encode_chunked)
   1304 

/usr/lib/python3.11/http/client.py in _send_request(self, method, url, body, headers, encode_chunked)
   1348             body = _encode(body, 'body')
-> 1349         self.endheaders(body, encode_chunked=encode_chunked)
   1350 

/usr/lib/python3.11/http/client.py in endheaders(self, message_body, encode_chunked)
   1297             raise CannotSendHeader()
-> 1298         self._send_output(message_body, encode_chunked=encode_chunked)
   1299 

/usr/lib/python3.11/http/client.py in _send_output(self, message_body, encode_chunked)
   1057         del self._buffer[:]
-> 1058         self.send(msg)
   1059 

/usr/lib/python3.11/http/client.py in send(self, data)
    995             if self.auto_open:
--> 996                 self.connect()
    997             else:

/usr/lib/python3.11/http/client.py in connect(self)
   1467 
-> 1468             super().connect()
   1469 

/usr/lib/python3.11/http/client.py in connect(self)
    961         sys.audit("http.client.connect", self, self.host, self.port)
--> 962         self.sock = self._create_connection(
    963             (self.host,self.port), self.timeout, self.source_address)

/usr/lib/python3.11/socket.py in create_connection(address, timeout, source_address, all_errors)
    862             if not all_errors:
--> 863                 raise exceptions[0]
    864             raise ExceptionGroup("create_connection failed", exceptions)

/usr/lib/python3.11/socket.py in create_connection(address, timeout, source_address, all_errors)
    847                 sock.bind(source_address)
--> 848             sock.connect(sa)
    849             # Break explicitly a reference cycle

TimeoutError: [Errno 110] Connection timed out

During handling of the above exception, another exception occurred:

URLError                                  Traceback (most recent call last)
<ipython-input-19-4db8929e7178> in <cell line: 0>()
      6 # Load MNIST dataset
      7 from sklearn.datasets import fetch_openml
----> 8 mnist = fetch_openml('mnist_784')
      9 
     10 X, y = mnist.data, mnist.target.astype(np.int)

/usr/local/lib/python3.11/dist-packages/sklearn/utils/_param_validation.py in wrapper(*args, **kwargs)
    214                     )
    215                 ):
--> 216                     return func(*args, **kwargs)
    217             except InvalidParameterError as e:
    218                 # When the function is just a wrapper around an estimator, we allow

/usr/local/lib/python3.11/dist-packages/sklearn/datasets/_openml.py in fetch_openml(name, version, data_id, data_home, target_column, cache, return_X_y, as_frame, n_retries, delay, parser, read_csv_kwargs)
   1009                 "both.".format(data_id, name)
   1010             )
-> 1011         data_info = _get_data_info_by_name(
   1012             name, version, data_home, n_retries=n_retries, delay=delay
   1013         )

/usr/local/lib/python3.11/dist-packages/sklearn/datasets/_openml.py in _get_data_info_by_name(name, version, data_home, n_retries, delay)
    300         url = _SEARCH_NAME.format(name) + "/status/active/"
    301         error_msg = "No active dataset {} found.".format(name)
--> 302         json_data = _get_json_content_from_openml_api(
    303             url,
    304             error_msg,

/usr/local/lib/python3.11/dist-packages/sklearn/datasets/_openml.py in _get_json_content_from_openml_api(url, error_message, data_home, n_retries, delay)
    244 
    245     try:
--> 246         return _load_json()
    247     except HTTPError as error:
    248         # 412 is an OpenML specific error code, indicating a generic error

/usr/local/lib/python3.11/dist-packages/sklearn/datasets/_openml.py in wrapper(*args, **kw)
     65                 return f(*args, **kw)
     66             try:
---> 67                 return f(*args, **kw)
     68             except URLError:
     69                 raise

/usr/local/lib/python3.11/dist-packages/sklearn/datasets/_openml.py in _load_json()
    239     def _load_json():
    240         with closing(
--> 241             _open_openml_url(url, data_home, n_retries=n_retries, delay=delay)
    242         ) as response:
    243             return json.loads(response.read().decode("utf-8"))

/usr/local/lib/python3.11/dist-packages/sklearn/datasets/_openml.py in _open_openml_url(openml_path, data_home, n_retries, delay)
    171             with TemporaryDirectory(dir=dir_name) as tmpdir:
    172                 with closing(
--> 173                     _retry_on_network_error(n_retries, delay, req.full_url)(urlopen)(
    174                         req
    175                     )

/usr/local/lib/python3.11/dist-packages/sklearn/datasets/_openml.py in wrapper(*args, **kwargs)
    101             while True:
    102                 try:
--> 103                     return f(*args, **kwargs)
    104                 except (URLError, TimeoutError) as e:
    105                     # 412 is a specific OpenML error code.

/usr/lib/python3.11/urllib/request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
    214     else:
    215         opener = _opener
--> 216     return opener.open(url, data, timeout)
    217 
    218 def install_opener(opener):

/usr/lib/python3.11/urllib/request.py in open(self, fullurl, data, timeout)
    517 
    518         sys.audit('urllib.Request', req.full_url, req.data, req.headers, req.get_method())
--> 519         response = self._open(req, data)
    520 
    521         # post-process response

/usr/lib/python3.11/urllib/request.py in _open(self, req, data)
    534 
    535         protocol = req.type
--> 536         result = self._call_chain(self.handle_open, protocol, protocol +
    537                                   '_open', req)
    538         if result:

/usr/lib/python3.11/urllib/request.py in _call_chain(self, chain, kind, meth_name, *args)
    494         for handler in handlers:
    495             func = getattr(handler, meth_name)
--> 496             result = func(*args)
    497             if result is not None:
    498                 return result

/usr/lib/python3.11/urllib/request.py in https_open(self, req)
   1389 
   1390         def https_open(self, req):
-> 1391             return self.do_open(http.client.HTTPSConnection, req,
   1392                 context=self._context, check_hostname=self._check_hostname)
   1393 

/usr/lib/python3.11/urllib/request.py in do_open(self, http_class, req, **http_conn_args)
   1349                           encode_chunked=req.has_header('Transfer-encoding'))
   1350             except OSError as err: # timeout error
-> 1351                 raise URLError(err)
   1352             r = h.getresponse()
   1353         except:

URLError: <urlopen error [Errno 110] Connection timed out>

6. Dlib¶

Dlib is a modern C++ toolkit with Python bindings that excels in machine learning, image processing, and facial recognition.

Key Features: Facial landmarks, face recognition, object detection.

Example: Facial Landmark Detection

In [26]:
import dlib
import cv2

# Load pre-trained shape predictor
shape_predictor = "shape_predictor_68_face_landmarks.dat"
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor(shape_predictor)

# Read an image
image = cv2.imread("example.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Detect faces
faces = detector(gray)
for face in faces:
    landmarks = predictor(gray, face)
    for n in range(68):
        x = landmarks.part(n).x
        y = landmarks.part(n).y
        cv2.circle(image, (x, y), 2, (0, 255, 0), -1)

cv2_imshow(image)
cv2.waitKey(0)
cv2.destroyAllWindows()
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-26-a115398fded4> in <cell line: 0>()
      5 shape_predictor = "shape_predictor_68_face_landmarks.dat"
      6 detector = dlib.get_frontal_face_detector()
----> 7 predictor = dlib.shape_predictor(shape_predictor)
      8 
      9 # Read an image

RuntimeError: Unable to open shape_predictor_68_face_landmarks.dat

Example: Face Detection

In [27]:
import dlib
import cv2

detector = dlib.get_frontal_face_detector()
image = cv2.imread('example.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

faces = detector(gray)
for face in faces:
    x, y, w, h = (face.left(), face.top(), face.width(), face.height())
    cv2.rectangle(image, (x, y), (x+w, y+h), (255, 0, 0), 2)

cv2_imshow(image)
cv2.waitKey(0)
cv2.destroyAllWindows()
No description has been provided for this image

7. Scikit-Image¶

Scikit-Image is a library built on NumPy and SciPy, designed for advanced image processing tasks like segmentation, transformations, and filtering.

Key Features: Robust preprocessing and feature extraction tools.

Example: Edge Detection Using Canny Algorithm

In [28]:
from skimage import io, feature, color
import matplotlib.pyplot as plt

# Load and convert image to grayscale
image = io.imread('example.jpg')
gray_image = color.rgb2gray(image)

# Apply Canny edge detection
edges = feature.canny(gray_image, sigma=2.0)

# Display results
plt.figure(figsize=(8, 8))
plt.imshow(edges, cmap='gray')
plt.title('Canny Edge Detection')
plt.axis('off')
plt.show()
No description has been provided for this image

Example: Histogram Equalization

In [29]:
from skimage import io, exposure, color
import matplotlib.pyplot as plt

image = io.imread('example.jpg', as_gray=True)
equalized = exposure.equalize_hist(image)

plt.imshow(equalized, cmap='gray')
plt.title('Histogram Equalized')
plt.axis('off')
plt.show()
No description has been provided for this image

Example: Image Segmentation with Watershed Algorithm

In [32]:
from skimage import io, color
from skimage.segmentation import watershed
from skimage.feature import peak_local_max
from scipy import ndimage
import numpy as np
import matplotlib.pyplot as plt

# Load an image
image = io.imread('example.jpg', as_gray=True)

image_int = image.astype(np.int32)

# Generate a boolean mask for local maxima
local_max_mask = peak_local_max(
    -image,
    footprint=np.ones((3, 3)),
    labels=image_int,
    min_distance=10  # Adjust this parameter as needed
)

# Check if local_max_mask is empty
if not np.any(local_max_mask):
    raise ValueError("No local maxima found. Check the input image or adjust parameters.")

# Debug: Visualize local maxima
plt.imshow(local_max_mask, cmap='gray')
plt.title("Local Maxima Mask")
plt.show()

# Generate markers for watershed
markers = ndimage.label(local_max_mask)[0]

# Debug: Check the markers
print(f"Markers shape: {markers.shape}, Image shape: {image.shape}")

# Ensure markers and image shapes match
if markers.shape != image.shape:
    raise ValueError(f"Shape mismatch: markers {markers.shape}, image {image.shape}")

# Apply watershed segmentation
segmented = watershed(-image, markers, mask=image.astype(bool))

# Display the segmented regions
plt.imshow(segmented, cmap='nipy_spectral')
plt.title("Segmented Image")
plt.show()
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-32-9e0fefec0fad> in <cell line: 0>()
     21 # Check if local_max_mask is empty
     22 if not np.any(local_max_mask):
---> 23     raise ValueError("No local maxima found. Check the input image or adjust parameters.")
     24 
     25 # Debug: Visualize local maxima

ValueError: No local maxima found. Check the input image or adjust parameters.

8. TensorFlow and Keras (For Deep Learning)¶

TensorFlow and Keras are frameworks for building and training deep learning models, often used for image classification, object detection, and more.

Example: Image Classification with Pre-trained MobileNetV2

In [30]:
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input, decode_predictions
from tensorflow.keras.preprocessing.image import load_img, img_to_array
import numpy as np

# Load a pre-trained model
model = MobileNetV2(weights='imagenet')

# Load and preprocess the image
image = load_img('example.jpg', target_size=(224, 224))
image_array = img_to_array(image)
image_array = np.expand_dims(image_array, axis=0)
image_array = preprocess_input(image_array)

# Predict the class
predictions = model.predict(image_array)
decoded_predictions = decode_predictions(predictions, top=3)

# Display predictions
for _, label, confidence in decoded_predictions[0]:
    print(f"{label}: {confidence*100:.2f}%")
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet_v2/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_224.h5
14536120/14536120 ━━━━━━━━━━━━━━━━━━━━ 2s 0us/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 4s 4s/step
cardigan: 36.61%
kimono: 36.36%
stole: 2.41%

Example: Training a Simple CNN for MNIST Digits

In [34]:
import tensorflow as tf
from tensorflow.keras import layers, models

# Load data
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Preprocess
x_train, x_test = x_train / 255.0, x_test / 255.0
x_train = x_train[..., tf.newaxis]
x_test = x_test[..., tf.newaxis]

# Build model
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step
/usr/local/lib/python3.11/dist-packages/keras/src/layers/convolutional/base_conv.py:107: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
Epoch 1/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 37s 19ms/step - accuracy: 0.8877 - loss: 0.4021
Epoch 2/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 39s 18ms/step - accuracy: 0.9756 - loss: 0.0865
Epoch 3/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 31s 17ms/step - accuracy: 0.9829 - loss: 0.0600
Epoch 4/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 41s 16ms/step - accuracy: 0.9857 - loss: 0.0507
Epoch 5/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 34s 18ms/step - accuracy: 0.9878 - loss: 0.0412
313/313 ━━━━━━━━━━━━━━━━━━━━ 2s 6ms/step - accuracy: 0.9761 - loss: 0.0718
Out[34]:
[0.058390699326992035, 0.980400025844574]

9. PyTorch¶

PyTorch is a deep learning framework often used for custom neural networks and advanced computer vision tasks.

Example: Image Classification Using TorchVision’s ResNet

In [35]:
import torch
from torchvision import models, transforms
from PIL import Image

# Load pre-trained ResNet model
model = models.resnet50(pretrained=True)
model.eval()

# Preprocess the image
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

image = Image.open('example.jpg')
image_tensor = transform(image).unsqueeze(0)

# Predict
with torch.no_grad():
    outputs = model(image_tensor)
    _, predicted = outputs.max(1)
    print(f"Predicted class index: {predicted.item()}")
/usr/local/lib/python3.11/dist-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
/usr/local/lib/python3.11/dist-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ResNet50_Weights.IMAGENET1K_V1`. You can also use `weights=ResNet50_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
Downloading: "https://download.pytorch.org/models/resnet50-0676ba61.pth" to /root/.cache/torch/hub/checkpoints/resnet50-0676ba61.pth
100%|██████████| 97.8M/97.8M [00:00<00:00, 122MB/s]
Predicted class index: 903

Example: Transfer Learning with ResNet

In [36]:
import torch
from torchvision import models, transforms
from PIL import Image

# Load pre-trained ResNet model
model = models.resnet18(pretrained=True)
model.eval()

# Preprocess an image
transform = transforms.Compose([transforms.Resize(256), transforms.CenterCrop(224), transforms.ToTensor()])
image = Image.open("example.jpg")
image = transform(image).unsqueeze(0)

# Predict
with torch.no_grad():
    outputs = model(image)
    _, predicted = outputs.max(1)
print(f"Predicted Class: {predicted.item()}")
/usr/local/lib/python3.11/dist-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ResNet18_Weights.IMAGENET1K_V1`. You can also use `weights=ResNet18_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
Downloading: "https://download.pytorch.org/models/resnet18-f37072fd.pth" to /root/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth
100%|██████████| 44.7M/44.7M [00:00<00:00, 120MB/s]
Predicted Class: 869

10. SimpleITK¶

SimpleITK is designed for medical image processing and includes advanced techniques like registration and segmentation.

Key Features: Works with DICOM and other medical imaging formats.

Example: Reading and Displaying a Medical Image

In [38]:
!pip install SimpleITK
Collecting SimpleITK
  Downloading SimpleITK-2.4.1-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (7.9 kB)
Downloading SimpleITK-2.4.1-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (52.3 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 52.3/52.3 MB 13.2 MB/s eta 0:00:00
Installing collected packages: SimpleITK
Successfully installed SimpleITK-2.4.1
In [40]:
import SimpleITK as sitk
import matplotlib.pyplot as plt

# Read a medical image
image = sitk.ReadImage("medical_image.dcm")

# Convert to numpy array for visualization
image_array = sitk.GetArrayFromImage(image)

# Display the image
plt.imshow(image_array[0], cmap='gray')
plt.title("Medical Image")
plt.axis('off')
plt.show()
No description has been provided for this image

Example: Image Smoothing (Gaussian Blur)

In [41]:
import SimpleITK as sitk
import matplotlib.pyplot as plt

image = sitk.ReadImage('medical_image.dcm')
smoothed = sitk.SmoothingRecursiveGaussian(image, sigma=2)

plt.imshow(sitk.GetArrayFromImage(smoothed)[0], cmap='gray')
plt.show()
No description has been provided for this image

11. Mediapipe¶

Mediapipe, developed by Google, is a library for real-time face, hand, and pose tracking.

Key Features: Pre-trained models for real-time applications.

Example: Hand Detection

In [43]:
!pip install mediapipe
Collecting mediapipe
  Downloading mediapipe-0.10.20-cp311-cp311-manylinux_2_28_x86_64.whl.metadata (9.7 kB)
Requirement already satisfied: absl-py in /usr/local/lib/python3.11/dist-packages (from mediapipe) (1.4.0)
Requirement already satisfied: attrs>=19.1.0 in /usr/local/lib/python3.11/dist-packages (from mediapipe) (24.3.0)
Requirement already satisfied: flatbuffers>=2.0 in /usr/local/lib/python3.11/dist-packages (from mediapipe) (24.12.23)
Requirement already satisfied: jax in /usr/local/lib/python3.11/dist-packages (from mediapipe) (0.4.33)
Requirement already satisfied: jaxlib in /usr/local/lib/python3.11/dist-packages (from mediapipe) (0.4.33)
Requirement already satisfied: matplotlib in /usr/local/lib/python3.11/dist-packages (from mediapipe) (3.10.0)
Requirement already satisfied: numpy<2 in /usr/local/lib/python3.11/dist-packages (from mediapipe) (1.26.4)
Requirement already satisfied: opencv-contrib-python in /usr/local/lib/python3.11/dist-packages (from mediapipe) (4.10.0.84)
Requirement already satisfied: protobuf<5,>=4.25.3 in /usr/local/lib/python3.11/dist-packages (from mediapipe) (4.25.5)
Collecting sounddevice>=0.4.4 (from mediapipe)
  Downloading sounddevice-0.5.1-py3-none-any.whl.metadata (1.4 kB)
Requirement already satisfied: sentencepiece in /usr/local/lib/python3.11/dist-packages (from mediapipe) (0.2.0)
Requirement already satisfied: CFFI>=1.0 in /usr/local/lib/python3.11/dist-packages (from sounddevice>=0.4.4->mediapipe) (1.17.1)
Requirement already satisfied: ml-dtypes>=0.2.0 in /usr/local/lib/python3.11/dist-packages (from jax->mediapipe) (0.4.1)
Requirement already satisfied: opt-einsum in /usr/local/lib/python3.11/dist-packages (from jax->mediapipe) (3.4.0)
Requirement already satisfied: scipy>=1.10 in /usr/local/lib/python3.11/dist-packages (from jax->mediapipe) (1.13.1)
Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib->mediapipe) (1.3.1)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.11/dist-packages (from matplotlib->mediapipe) (0.12.1)
Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.11/dist-packages (from matplotlib->mediapipe) (4.55.3)
Requirement already satisfied: kiwisolver>=1.3.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib->mediapipe) (1.4.8)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.11/dist-packages (from matplotlib->mediapipe) (24.2)
Requirement already satisfied: pillow>=8 in /usr/local/lib/python3.11/dist-packages (from matplotlib->mediapipe) (11.1.0)
Requirement already satisfied: pyparsing>=2.3.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib->mediapipe) (3.2.1)
Requirement already satisfied: python-dateutil>=2.7 in /usr/local/lib/python3.11/dist-packages (from matplotlib->mediapipe) (2.8.2)
Requirement already satisfied: pycparser in /usr/local/lib/python3.11/dist-packages (from CFFI>=1.0->sounddevice>=0.4.4->mediapipe) (2.22)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.11/dist-packages (from python-dateutil>=2.7->matplotlib->mediapipe) (1.17.0)
Downloading mediapipe-0.10.20-cp311-cp311-manylinux_2_28_x86_64.whl (35.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 35.6/35.6 MB 23.5 MB/s eta 0:00:00
Downloading sounddevice-0.5.1-py3-none-any.whl (32 kB)
Installing collected packages: sounddevice, mediapipe
Successfully installed mediapipe-0.10.20 sounddevice-0.5.1
In [31]:
import mediapipe as mp
import cv2

mp_hands = mp.solutions.hands
hands = mp_hands.Hands()
mp_drawing = mp.solutions.drawing_utils

# Start video capture
cap = cv2.VideoCapture(0)

while cap.isOpened():
    ret, frame = cap.read()
    image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    results = hands.process(image)

    if results.multi_hand_landmarks:
        for hand_landmarks in results.multi_hand_landmarks:
            mp_drawing.draw_landmarks(frame, hand_landmarks, mp_hands.HAND_CONNECTIONS)

    cv2.imshow('Hand Detection', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

Example: Pose Estimation

In [32]:
import mediapipe as mp
import cv2

mp_pose = mp.solutions.pose
pose = mp_pose.Pose()
mp_drawing = mp.solutions.drawing_utils

cap = cv2.VideoCapture(0)

while cap.isOpened():
    ret, frame = cap.read()
    frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    results = pose.process(frame_rgb)

    if results.pose_landmarks:
        mp_drawing.draw_landmarks(frame, results.pose_landmarks, mp_pose.POSE_CONNECTIONS)

    cv2.imshow('Pose Detection', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

Example: Real-Time Face Mesh Detection

In [33]:
import cv2
import mediapipe as mp

mp_face_mesh = mp.solutions.face_mesh
mp_drawing = mp.solutions.drawing_utils

cap = cv2.VideoCapture(0)
with mp_face_mesh.FaceMesh(max_num_faces=1) as face_mesh:
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break

        frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        results = face_mesh.process(frame_rgb)

        if results.multi_face_landmarks:
            for face_landmarks in results.multi_face_landmarks:
                mp_drawing.draw_landmarks(frame, face_landmarks, mp_face_mesh.FACE_CONNECTIONS)

        cv2.imshow('Face Mesh', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

cap.release()
cv2.destroyAllWindows()

12. Albumentations¶

Albumentations is a fast and flexible library for data augmentation.

Key Features: Advanced augmentation techniques for images.

Example: Augmenting an Image

In [34]:
import cv2
import albumentations as A
from albumentations.core.composition import OneOf
from albumentations.augmentations import transforms

# Define augmentations
augmentation = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.RandomBrightnessContrast(p=0.2),
    A.ShiftScaleRotate(shift_limit=0.05, scale_limit=0.05, rotate_limit=15, p=0.5),
])

# Load image
image = cv2.imread("example.jpg")

# Apply augmentations
augmented = augmentation(image=image)
cv2_imshow(augmented['image'])
cv2.waitKey(0)
cv2.destroyAllWindows()
/usr/local/lib/python3.11/dist-packages/albumentations/__init__.py:24: UserWarning: A new version of Albumentations is available: 2.0.0 (you have 1.4.20). Upgrade using: pip install -U albumentations. To disable automatic update checks, set the environment variable NO_ALBUMENTATIONS_UPDATE to 1.
  check_for_updates()
No description has been provided for this image

Example: Data Augmentation

In [35]:
import cv2
from albumentations import Compose, RandomBrightnessContrast, ShiftScaleRotate, HorizontalFlip

image = cv2.imread('example.jpg')
augmentations = Compose([
    HorizontalFlip(p=0.5),
    RandomBrightnessContrast(p=0.2),
    ShiftScaleRotate(shift_limit=0.05, scale_limit=0.05, rotate_limit=15, p=0.5),
])

augmented = augmentations(image=image)['image']
cv2_imshow(augmented)
cv2.waitKey(0)
cv2.destroyAllWindows()
No description has been provided for this image

Example: Applying Multiple Augmentations

In [36]:
import cv2
from albumentations import Compose, Blur, CLAHE, Flip, Rotate

# Load the image
image = cv2.imread('example.jpg')

# Define augmentations
augmentations = Compose([
    Blur(blur_limit=3, p=0.5),
    CLAHE(clip_limit=4.0, p=0.5),
    Flip(p=0.5),
    Rotate(limit=45, p=0.5),
])

# Apply augmentations
augmented = augmentations(image=image)['image']

# Show the result
cv2_imshow(augmented)
cv2.waitKey(0)
cv2.destroyAllWindows()
<ipython-input-36-2b74ecdefef8>:11: DeprecationWarning: Flip is deprecated. Consider using HorizontalFlip, VerticalFlip, RandomRotate90 or D4.
  Flip(p=0.5),
No description has been provided for this image

13. Detectron2¶

Example: Object Detection

In [2]:
!pip install git+https://github.com/facebookresearch/detectron2.git
Collecting git+https://github.com/facebookresearch/detectron2.git
  Cloning https://github.com/facebookresearch/detectron2.git to /tmp/pip-req-build-h2efrlaf
  Running command git clone --filter=blob:none --quiet https://github.com/facebookresearch/detectron2.git /tmp/pip-req-build-h2efrlaf
  Resolved https://github.com/facebookresearch/detectron2.git to commit 9604f5995cc628619f0e4fd913453b4d7d61db3f
  Preparing metadata (setup.py) ... done
Requirement already satisfied: Pillow>=7.1 in /usr/local/lib/python3.11/dist-packages (from detectron2==0.6) (11.1.0)
Requirement already satisfied: matplotlib in /usr/local/lib/python3.11/dist-packages (from detectron2==0.6) (3.10.0)
Requirement already satisfied: pycocotools>=2.0.2 in /usr/local/lib/python3.11/dist-packages (from detectron2==0.6) (2.0.8)
Requirement already satisfied: termcolor>=1.1 in /usr/local/lib/python3.11/dist-packages (from detectron2==0.6) (2.5.0)
Requirement already satisfied: yacs>=0.1.8 in /usr/local/lib/python3.11/dist-packages (from detectron2==0.6) (0.1.8)
Requirement already satisfied: tabulate in /usr/local/lib/python3.11/dist-packages (from detectron2==0.6) (0.9.0)
Requirement already satisfied: cloudpickle in /usr/local/lib/python3.11/dist-packages (from detectron2==0.6) (3.1.0)
Requirement already satisfied: tqdm>4.29.0 in /usr/local/lib/python3.11/dist-packages (from detectron2==0.6) (4.67.1)
Requirement already satisfied: tensorboard in /usr/local/lib/python3.11/dist-packages (from detectron2==0.6) (2.17.1)
Collecting fvcore<0.1.6,>=0.1.5 (from detectron2==0.6)
  Using cached fvcore-0.1.5.post20221221-py3-none-any.whl
Requirement already satisfied: iopath<0.1.10,>=0.1.7 in /usr/local/lib/python3.11/dist-packages (from detectron2==0.6) (0.1.9)
Requirement already satisfied: omegaconf<2.4,>=2.1 in /usr/local/lib/python3.11/dist-packages (from detectron2==0.6) (2.3.0)
Requirement already satisfied: hydra-core>=1.1 in /usr/local/lib/python3.11/dist-packages (from detectron2==0.6) (1.3.2)
Requirement already satisfied: black in /usr/local/lib/python3.11/dist-packages (from detectron2==0.6) (24.10.0)
Requirement already satisfied: packaging in /usr/local/lib/python3.11/dist-packages (from detectron2==0.6) (24.2)
Requirement already satisfied: numpy in /usr/local/lib/python3.11/dist-packages (from fvcore<0.1.6,>=0.1.5->detectron2==0.6) (1.26.4)
Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.11/dist-packages (from fvcore<0.1.6,>=0.1.5->detectron2==0.6) (6.0.2)
Requirement already satisfied: antlr4-python3-runtime==4.9.* in /usr/local/lib/python3.11/dist-packages (from hydra-core>=1.1->detectron2==0.6) (4.9.3)
Requirement already satisfied: portalocker in /usr/local/lib/python3.11/dist-packages (from iopath<0.1.10,>=0.1.7->detectron2==0.6) (3.1.1)
Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib->detectron2==0.6) (1.3.1)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.11/dist-packages (from matplotlib->detectron2==0.6) (0.12.1)
Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.11/dist-packages (from matplotlib->detectron2==0.6) (4.55.3)
Requirement already satisfied: kiwisolver>=1.3.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib->detectron2==0.6) (1.4.8)
Requirement already satisfied: pyparsing>=2.3.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib->detectron2==0.6) (3.2.1)
Requirement already satisfied: python-dateutil>=2.7 in /usr/local/lib/python3.11/dist-packages (from matplotlib->detectron2==0.6) (2.8.2)
Requirement already satisfied: click>=8.0.0 in /usr/local/lib/python3.11/dist-packages (from black->detectron2==0.6) (8.1.8)
Requirement already satisfied: mypy-extensions>=0.4.3 in /usr/local/lib/python3.11/dist-packages (from black->detectron2==0.6) (1.0.0)
Requirement already satisfied: pathspec>=0.9.0 in /usr/local/lib/python3.11/dist-packages (from black->detectron2==0.6) (0.12.1)
Requirement already satisfied: platformdirs>=2 in /usr/local/lib/python3.11/dist-packages (from black->detectron2==0.6) (4.3.6)
Requirement already satisfied: absl-py>=0.4 in /usr/local/lib/python3.11/dist-packages (from tensorboard->detectron2==0.6) (1.4.0)
Requirement already satisfied: grpcio>=1.48.2 in /usr/local/lib/python3.11/dist-packages (from tensorboard->detectron2==0.6) (1.69.0)
Requirement already satisfied: markdown>=2.6.8 in /usr/local/lib/python3.11/dist-packages (from tensorboard->detectron2==0.6) (3.7)
Requirement already satisfied: protobuf!=4.24.0,>=3.19.6 in /usr/local/lib/python3.11/dist-packages (from tensorboard->detectron2==0.6) (4.25.5)
Requirement already satisfied: setuptools>=41.0.0 in /usr/local/lib/python3.11/dist-packages (from tensorboard->detectron2==0.6) (75.1.0)
Requirement already satisfied: six>1.9 in /usr/local/lib/python3.11/dist-packages (from tensorboard->detectron2==0.6) (1.17.0)
Requirement already satisfied: tensorboard-data-server<0.8.0,>=0.7.0 in /usr/local/lib/python3.11/dist-packages (from tensorboard->detectron2==0.6) (0.7.2)
Requirement already satisfied: werkzeug>=1.0.1 in /usr/local/lib/python3.11/dist-packages (from tensorboard->detectron2==0.6) (3.1.3)
Requirement already satisfied: MarkupSafe>=2.1.1 in /usr/local/lib/python3.11/dist-packages (from werkzeug>=1.0.1->tensorboard->detectron2==0.6) (3.0.2)
Installing collected packages: fvcore
  Attempting uninstall: fvcore
    Found existing installation: fvcore 0.1.1.dev200512
    Uninstalling fvcore-0.1.1.dev200512:
      Successfully uninstalled fvcore-0.1.1.dev200512
Successfully installed fvcore-0.1.5.post20221221
In [4]:
!pip install fvcore==0.1.1.dev200512
Collecting fvcore==0.1.1.dev200512
  Downloading fvcore-0.1.1.dev200512.tar.gz (33 kB)
  Preparing metadata (setup.py) ... done
Requirement already satisfied: numpy in /usr/local/lib/python3.11/dist-packages (from fvcore==0.1.1.dev200512) (1.26.4)
Requirement already satisfied: yacs>=0.1.6 in /usr/local/lib/python3.11/dist-packages (from fvcore==0.1.1.dev200512) (0.1.8)
Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.11/dist-packages (from fvcore==0.1.1.dev200512) (6.0.2)
Requirement already satisfied: tqdm in /usr/local/lib/python3.11/dist-packages (from fvcore==0.1.1.dev200512) (4.67.1)
Requirement already satisfied: portalocker in /usr/local/lib/python3.11/dist-packages (from fvcore==0.1.1.dev200512) (3.1.1)
Requirement already satisfied: termcolor>=1.1 in /usr/local/lib/python3.11/dist-packages (from fvcore==0.1.1.dev200512) (2.5.0)
Requirement already satisfied: Pillow in /usr/local/lib/python3.11/dist-packages (from fvcore==0.1.1.dev200512) (11.1.0)
Requirement already satisfied: tabulate in /usr/local/lib/python3.11/dist-packages (from fvcore==0.1.1.dev200512) (0.9.0)
Building wheels for collected packages: fvcore
  Building wheel for fvcore (setup.py) ... done
  Created wheel for fvcore: filename=fvcore-0.1.1.dev200512-py3-none-any.whl size=40855 sha256=83d3f4c1ea644e2f92efd4ea385d578715a437cb5a39d390c28ffae05beba701
  Stored in directory: /root/.cache/pip/wheels/d9/06/dd/45fbf0c83e2544d7d6f93d830bb15d310c3a2e4dc764457589
Successfully built fvcore
Installing collected packages: fvcore
  Attempting uninstall: fvcore
    Found existing installation: fvcore 0.1.5.post20221221
    Uninstalling fvcore-0.1.5.post20221221:
      Successfully uninstalled fvcore-0.1.5.post20221221
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
detectron2 0.6 requires fvcore<0.1.6,>=0.1.5, but you have fvcore 0.1.1.dev200512 which is incompatible.
Successfully installed fvcore-0.1.1.dev200512
In [5]:
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
import cv2
from google.colab import drive

drive.mount('/content/drive')


cfg = get_cfg()
cfg.merge_from_file("/content/drive/My Drive/detectron2-main/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.WEIGHTS = "detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl"
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5

predictor = DefaultPredictor(cfg)
image = cv2.imread("example.jpg")
outputs = predictor(image)

v = Visualizer(image[:, :, ::-1], metadata=None)
output_image = v.draw_instance_predictions(outputs["instances"].to("cpu"))

cv2_imshow(output_image.get_image()[:, :, ::-1])
cv2.waitKey(0)
cv2.destroyAllWindows()
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
No description has been provided for this image

14. Open3D¶

Purpose: Open3D focuses on 3D data visualization and manipulation, including point cloud operations.

Example: Visualizing a Point Cloud

In [6]:
!pip install open3d
Collecting open3d
  Downloading open3d-0.19.0-cp311-cp311-manylinux_2_31_x86_64.whl.metadata (4.3 kB)
Requirement already satisfied: numpy>=1.18.0 in /usr/local/lib/python3.11/dist-packages (from open3d) (1.26.4)
Collecting dash>=2.6.0 (from open3d)
  Downloading dash-2.18.2-py3-none-any.whl.metadata (10 kB)
Requirement already satisfied: werkzeug>=3.0.0 in /usr/local/lib/python3.11/dist-packages (from open3d) (3.1.3)
Requirement already satisfied: flask>=3.0.0 in /usr/local/lib/python3.11/dist-packages (from open3d) (3.1.0)
Requirement already satisfied: nbformat>=5.7.0 in /usr/local/lib/python3.11/dist-packages (from open3d) (5.10.4)
Collecting configargparse (from open3d)
  Downloading ConfigArgParse-1.7-py3-none-any.whl.metadata (23 kB)
Collecting ipywidgets>=8.0.4 (from open3d)
  Downloading ipywidgets-8.1.5-py3-none-any.whl.metadata (2.3 kB)
Collecting addict (from open3d)
  Downloading addict-2.4.0-py3-none-any.whl.metadata (1.0 kB)
Requirement already satisfied: pillow>=9.3.0 in /usr/local/lib/python3.11/dist-packages (from open3d) (11.1.0)
Requirement already satisfied: matplotlib>=3 in /usr/local/lib/python3.11/dist-packages (from open3d) (3.10.0)
Requirement already satisfied: pandas>=1.0 in /usr/local/lib/python3.11/dist-packages (from open3d) (2.2.2)
Requirement already satisfied: pyyaml>=5.4.1 in /usr/local/lib/python3.11/dist-packages (from open3d) (6.0.2)
Requirement already satisfied: scikit-learn>=0.21 in /usr/local/lib/python3.11/dist-packages (from open3d) (1.6.0)
Requirement already satisfied: tqdm in /usr/local/lib/python3.11/dist-packages (from open3d) (4.67.1)
Collecting pyquaternion (from open3d)
  Downloading pyquaternion-0.9.9-py3-none-any.whl.metadata (1.4 kB)
Collecting flask>=3.0.0 (from open3d)
  Downloading flask-3.0.3-py3-none-any.whl.metadata (3.2 kB)
Collecting werkzeug>=3.0.0 (from open3d)
  Downloading werkzeug-3.0.6-py3-none-any.whl.metadata (3.7 kB)
Requirement already satisfied: plotly>=5.0.0 in /usr/local/lib/python3.11/dist-packages (from dash>=2.6.0->open3d) (5.24.1)
Collecting dash-html-components==2.0.0 (from dash>=2.6.0->open3d)
  Downloading dash_html_components-2.0.0-py3-none-any.whl.metadata (3.8 kB)
Collecting dash-core-components==2.0.0 (from dash>=2.6.0->open3d)
  Downloading dash_core_components-2.0.0-py3-none-any.whl.metadata (2.9 kB)
Collecting dash-table==5.0.0 (from dash>=2.6.0->open3d)
  Downloading dash_table-5.0.0-py3-none-any.whl.metadata (2.4 kB)
Requirement already satisfied: importlib-metadata in /usr/local/lib/python3.11/dist-packages (from dash>=2.6.0->open3d) (8.5.0)
Requirement already satisfied: typing-extensions>=4.1.1 in /usr/local/lib/python3.11/dist-packages (from dash>=2.6.0->open3d) (4.12.2)
Requirement already satisfied: requests in /usr/local/lib/python3.11/dist-packages (from dash>=2.6.0->open3d) (2.32.3)
Collecting retrying (from dash>=2.6.0->open3d)
  Downloading retrying-1.3.4-py3-none-any.whl.metadata (6.9 kB)
Requirement already satisfied: nest-asyncio in /usr/local/lib/python3.11/dist-packages (from dash>=2.6.0->open3d) (1.6.0)
Requirement already satisfied: setuptools in /usr/local/lib/python3.11/dist-packages (from dash>=2.6.0->open3d) (75.1.0)
Requirement already satisfied: Jinja2>=3.1.2 in /usr/local/lib/python3.11/dist-packages (from flask>=3.0.0->open3d) (3.1.5)
Requirement already satisfied: itsdangerous>=2.1.2 in /usr/local/lib/python3.11/dist-packages (from flask>=3.0.0->open3d) (2.2.0)
Requirement already satisfied: click>=8.1.3 in /usr/local/lib/python3.11/dist-packages (from flask>=3.0.0->open3d) (8.1.8)
Requirement already satisfied: blinker>=1.6.2 in /usr/local/lib/python3.11/dist-packages (from flask>=3.0.0->open3d) (1.9.0)
Collecting comm>=0.1.3 (from ipywidgets>=8.0.4->open3d)
  Downloading comm-0.2.2-py3-none-any.whl.metadata (3.7 kB)
Requirement already satisfied: ipython>=6.1.0 in /usr/local/lib/python3.11/dist-packages (from ipywidgets>=8.0.4->open3d) (7.34.0)
Requirement already satisfied: traitlets>=4.3.1 in /usr/local/lib/python3.11/dist-packages (from ipywidgets>=8.0.4->open3d) (5.7.1)
Collecting widgetsnbextension~=4.0.12 (from ipywidgets>=8.0.4->open3d)
  Downloading widgetsnbextension-4.0.13-py3-none-any.whl.metadata (1.6 kB)
Requirement already satisfied: jupyterlab-widgets~=3.0.12 in /usr/local/lib/python3.11/dist-packages (from ipywidgets>=8.0.4->open3d) (3.0.13)
Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib>=3->open3d) (1.3.1)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.11/dist-packages (from matplotlib>=3->open3d) (0.12.1)
Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.11/dist-packages (from matplotlib>=3->open3d) (4.55.3)
Requirement already satisfied: kiwisolver>=1.3.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib>=3->open3d) (1.4.8)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.11/dist-packages (from matplotlib>=3->open3d) (24.2)
Requirement already satisfied: pyparsing>=2.3.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib>=3->open3d) (3.2.1)
Requirement already satisfied: python-dateutil>=2.7 in /usr/local/lib/python3.11/dist-packages (from matplotlib>=3->open3d) (2.8.2)
Requirement already satisfied: fastjsonschema>=2.15 in /usr/local/lib/python3.11/dist-packages (from nbformat>=5.7.0->open3d) (2.21.1)
Requirement already satisfied: jsonschema>=2.6 in /usr/local/lib/python3.11/dist-packages (from nbformat>=5.7.0->open3d) (4.23.0)
Requirement already satisfied: jupyter-core!=5.0.*,>=4.12 in /usr/local/lib/python3.11/dist-packages (from nbformat>=5.7.0->open3d) (5.7.2)
Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.11/dist-packages (from pandas>=1.0->open3d) (2024.2)
Requirement already satisfied: tzdata>=2022.7 in /usr/local/lib/python3.11/dist-packages (from pandas>=1.0->open3d) (2024.2)
Requirement already satisfied: scipy>=1.6.0 in /usr/local/lib/python3.11/dist-packages (from scikit-learn>=0.21->open3d) (1.13.1)
Requirement already satisfied: joblib>=1.2.0 in /usr/local/lib/python3.11/dist-packages (from scikit-learn>=0.21->open3d) (1.4.2)
Requirement already satisfied: threadpoolctl>=3.1.0 in /usr/local/lib/python3.11/dist-packages (from scikit-learn>=0.21->open3d) (3.5.0)
Requirement already satisfied: MarkupSafe>=2.1.1 in /usr/local/lib/python3.11/dist-packages (from werkzeug>=3.0.0->open3d) (3.0.2)
Collecting jedi>=0.16 (from ipython>=6.1.0->ipywidgets>=8.0.4->open3d)
  Downloading jedi-0.19.2-py2.py3-none-any.whl.metadata (22 kB)
Requirement already satisfied: decorator in /usr/local/lib/python3.11/dist-packages (from ipython>=6.1.0->ipywidgets>=8.0.4->open3d) (4.4.2)
Requirement already satisfied: pickleshare in /usr/local/lib/python3.11/dist-packages (from ipython>=6.1.0->ipywidgets>=8.0.4->open3d) (0.7.5)
Requirement already satisfied: prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0 in /usr/local/lib/python3.11/dist-packages (from ipython>=6.1.0->ipywidgets>=8.0.4->open3d) (3.0.48)
Requirement already satisfied: pygments in /usr/local/lib/python3.11/dist-packages (from ipython>=6.1.0->ipywidgets>=8.0.4->open3d) (2.18.0)
Requirement already satisfied: backcall in /usr/local/lib/python3.11/dist-packages (from ipython>=6.1.0->ipywidgets>=8.0.4->open3d) (0.2.0)
Requirement already satisfied: matplotlib-inline in /usr/local/lib/python3.11/dist-packages (from ipython>=6.1.0->ipywidgets>=8.0.4->open3d) (0.1.7)
Requirement already satisfied: pexpect>4.3 in /usr/local/lib/python3.11/dist-packages (from ipython>=6.1.0->ipywidgets>=8.0.4->open3d) (4.9.0)
Requirement already satisfied: attrs>=22.2.0 in /usr/local/lib/python3.11/dist-packages (from jsonschema>=2.6->nbformat>=5.7.0->open3d) (24.3.0)
Requirement already satisfied: jsonschema-specifications>=2023.03.6 in /usr/local/lib/python3.11/dist-packages (from jsonschema>=2.6->nbformat>=5.7.0->open3d) (2024.10.1)
Requirement already satisfied: referencing>=0.28.4 in /usr/local/lib/python3.11/dist-packages (from jsonschema>=2.6->nbformat>=5.7.0->open3d) (0.35.1)
Requirement already satisfied: rpds-py>=0.7.1 in /usr/local/lib/python3.11/dist-packages (from jsonschema>=2.6->nbformat>=5.7.0->open3d) (0.22.3)
Requirement already satisfied: platformdirs>=2.5 in /usr/local/lib/python3.11/dist-packages (from jupyter-core!=5.0.*,>=4.12->nbformat>=5.7.0->open3d) (4.3.6)
Requirement already satisfied: tenacity>=6.2.0 in /usr/local/lib/python3.11/dist-packages (from plotly>=5.0.0->dash>=2.6.0->open3d) (9.0.0)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.11/dist-packages (from python-dateutil>=2.7->matplotlib>=3->open3d) (1.17.0)
Requirement already satisfied: zipp>=3.20 in /usr/local/lib/python3.11/dist-packages (from importlib-metadata->dash>=2.6.0->open3d) (3.21.0)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.11/dist-packages (from requests->dash>=2.6.0->open3d) (3.4.1)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.11/dist-packages (from requests->dash>=2.6.0->open3d) (3.10)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.11/dist-packages (from requests->dash>=2.6.0->open3d) (2.3.0)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.11/dist-packages (from requests->dash>=2.6.0->open3d) (2024.12.14)
Requirement already satisfied: parso<0.9.0,>=0.8.4 in /usr/local/lib/python3.11/dist-packages (from jedi>=0.16->ipython>=6.1.0->ipywidgets>=8.0.4->open3d) (0.8.4)
Requirement already satisfied: ptyprocess>=0.5 in /usr/local/lib/python3.11/dist-packages (from pexpect>4.3->ipython>=6.1.0->ipywidgets>=8.0.4->open3d) (0.7.0)
Requirement already satisfied: wcwidth in /usr/local/lib/python3.11/dist-packages (from prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0->ipython>=6.1.0->ipywidgets>=8.0.4->open3d) (0.2.13)
Downloading open3d-0.19.0-cp311-cp311-manylinux_2_31_x86_64.whl (447.7 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 447.7/447.7 MB 4.1 MB/s eta 0:00:00
Downloading dash-2.18.2-py3-none-any.whl (7.8 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.8/7.8 MB 111.7 MB/s eta 0:00:00
Downloading dash_core_components-2.0.0-py3-none-any.whl (3.8 kB)
Downloading dash_html_components-2.0.0-py3-none-any.whl (4.1 kB)
Downloading dash_table-5.0.0-py3-none-any.whl (3.9 kB)
Downloading flask-3.0.3-py3-none-any.whl (101 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 101.7/101.7 kB 10.3 MB/s eta 0:00:00
Downloading ipywidgets-8.1.5-py3-none-any.whl (139 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 139.8/139.8 kB 14.4 MB/s eta 0:00:00
Downloading werkzeug-3.0.6-py3-none-any.whl (227 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 228.0/228.0 kB 23.0 MB/s eta 0:00:00
Downloading addict-2.4.0-py3-none-any.whl (3.8 kB)
Downloading ConfigArgParse-1.7-py3-none-any.whl (25 kB)
Downloading pyquaternion-0.9.9-py3-none-any.whl (14 kB)
Downloading comm-0.2.2-py3-none-any.whl (7.2 kB)
Downloading widgetsnbextension-4.0.13-py3-none-any.whl (2.3 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.3/2.3 MB 81.2 MB/s eta 0:00:00
Downloading retrying-1.3.4-py3-none-any.whl (11 kB)
Downloading jedi-0.19.2-py2.py3-none-any.whl (1.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 65.8 MB/s eta 0:00:00
Installing collected packages: dash-table, dash-html-components, dash-core-components, addict, widgetsnbextension, werkzeug, retrying, pyquaternion, jedi, configargparse, comm, flask, ipywidgets, dash, open3d
  Attempting uninstall: widgetsnbextension
    Found existing installation: widgetsnbextension 3.6.10
    Uninstalling widgetsnbextension-3.6.10:
      Successfully uninstalled widgetsnbextension-3.6.10
  Attempting uninstall: werkzeug
    Found existing installation: Werkzeug 3.1.3
    Uninstalling Werkzeug-3.1.3:
      Successfully uninstalled Werkzeug-3.1.3
  Attempting uninstall: flask
    Found existing installation: Flask 3.1.0
    Uninstalling Flask-3.1.0:
      Successfully uninstalled Flask-3.1.0
  Attempting uninstall: ipywidgets
    Found existing installation: ipywidgets 7.7.1
    Uninstalling ipywidgets-7.7.1:
      Successfully uninstalled ipywidgets-7.7.1
Successfully installed addict-2.4.0 comm-0.2.2 configargparse-1.7 dash-2.18.2 dash-core-components-2.0.0 dash-html-components-2.0.0 dash-table-5.0.0 flask-3.0.3 ipywidgets-8.1.5 jedi-0.19.2 open3d-0.19.0 pyquaternion-0.9.9 retrying-1.3.4 werkzeug-3.0.6 widgetsnbextension-4.0.13
In [9]:
import open3d as o3d

# Load a point cloud
pcd = o3d.io.read_point_cloud("point_cloud.ply")

# Save point cloud to a file
o3d.io.write_point_cloud("saved_point_cloud.ply", pcd)

# Visualize the point cloud
o3d.visualization.draw_geometries([pcd])
[Open3D WARNING] GLFW initialized for headless rendering.
[Open3D WARNING] GLFW Error: OSMesa: Library not found
[Open3D WARNING] Failed to create window
[Open3D WARNING] [DrawGeometries] Failed creating OpenGL window.

15. Kornia¶

Purpose: Kornia is a PyTorch-based library for differentiable image transformations. It’s used for tasks that require backpropagation through vision operations.

Example: Differentiable Image Rotation

In [11]:
!pip install kornia
Collecting kornia
  Downloading kornia-0.8.0-py2.py3-none-any.whl.metadata (17 kB)
Collecting kornia_rs>=0.1.0 (from kornia)
  Downloading kornia_rs-0.1.8-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (10 kB)
Requirement already satisfied: packaging in /usr/local/lib/python3.11/dist-packages (from kornia) (24.2)
Requirement already satisfied: torch>=1.9.1 in /usr/local/lib/python3.11/dist-packages (from kornia) (2.5.1+cu121)
Requirement already satisfied: filelock in /usr/local/lib/python3.11/dist-packages (from torch>=1.9.1->kornia) (3.16.1)
Requirement already satisfied: typing-extensions>=4.8.0 in /usr/local/lib/python3.11/dist-packages (from torch>=1.9.1->kornia) (4.12.2)
Requirement already satisfied: networkx in /usr/local/lib/python3.11/dist-packages (from torch>=1.9.1->kornia) (3.4.2)
Requirement already satisfied: jinja2 in /usr/local/lib/python3.11/dist-packages (from torch>=1.9.1->kornia) (3.1.5)
Requirement already satisfied: fsspec in /usr/local/lib/python3.11/dist-packages (from torch>=1.9.1->kornia) (2024.10.0)
Requirement already satisfied: nvidia-cuda-nvrtc-cu12==12.1.105 in /usr/local/lib/python3.11/dist-packages (from torch>=1.9.1->kornia) (12.1.105)
Requirement already satisfied: nvidia-cuda-runtime-cu12==12.1.105 in /usr/local/lib/python3.11/dist-packages (from torch>=1.9.1->kornia) (12.1.105)
Requirement already satisfied: nvidia-cuda-cupti-cu12==12.1.105 in /usr/local/lib/python3.11/dist-packages (from torch>=1.9.1->kornia) (12.1.105)
Requirement already satisfied: nvidia-cudnn-cu12==9.1.0.70 in /usr/local/lib/python3.11/dist-packages (from torch>=1.9.1->kornia) (9.1.0.70)
Requirement already satisfied: nvidia-cublas-cu12==12.1.3.1 in /usr/local/lib/python3.11/dist-packages (from torch>=1.9.1->kornia) (12.1.3.1)
Requirement already satisfied: nvidia-cufft-cu12==11.0.2.54 in /usr/local/lib/python3.11/dist-packages (from torch>=1.9.1->kornia) (11.0.2.54)
Requirement already satisfied: nvidia-curand-cu12==10.3.2.106 in /usr/local/lib/python3.11/dist-packages (from torch>=1.9.1->kornia) (10.3.2.106)
Requirement already satisfied: nvidia-cusolver-cu12==11.4.5.107 in /usr/local/lib/python3.11/dist-packages (from torch>=1.9.1->kornia) (11.4.5.107)
Requirement already satisfied: nvidia-cusparse-cu12==12.1.0.106 in /usr/local/lib/python3.11/dist-packages (from torch>=1.9.1->kornia) (12.1.0.106)
Requirement already satisfied: nvidia-nccl-cu12==2.21.5 in /usr/local/lib/python3.11/dist-packages (from torch>=1.9.1->kornia) (2.21.5)
Requirement already satisfied: nvidia-nvtx-cu12==12.1.105 in /usr/local/lib/python3.11/dist-packages (from torch>=1.9.1->kornia) (12.1.105)
Requirement already satisfied: triton==3.1.0 in /usr/local/lib/python3.11/dist-packages (from torch>=1.9.1->kornia) (3.1.0)
Requirement already satisfied: sympy==1.13.1 in /usr/local/lib/python3.11/dist-packages (from torch>=1.9.1->kornia) (1.13.1)
Requirement already satisfied: nvidia-nvjitlink-cu12 in /usr/local/lib/python3.11/dist-packages (from nvidia-cusolver-cu12==11.4.5.107->torch>=1.9.1->kornia) (12.6.85)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in /usr/local/lib/python3.11/dist-packages (from sympy==1.13.1->torch>=1.9.1->kornia) (1.3.0)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.11/dist-packages (from jinja2->torch>=1.9.1->kornia) (3.0.2)
Downloading kornia-0.8.0-py2.py3-none-any.whl (1.1 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.0/1.1 MB ? eta -:--:--
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 53.0 MB/s eta 0:00:00
Downloading kornia_rs-0.1.8-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.1 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.1/2.1 MB 80.4 MB/s eta 0:00:00
Installing collected packages: kornia_rs, kornia
Successfully installed kornia-0.8.0 kornia_rs-0.1.8
In [12]:
import torch
import kornia
import matplotlib.pyplot as plt
from PIL import Image
import numpy as np

# Load an image
image = Image.open("example.jpg")
image_tensor = torch.tensor(np.array(image)).permute(2, 0, 1).float() / 255.0
image_tensor = image_tensor.unsqueeze(0)

# Rotate the image
angle = torch.tensor([45.0])  # Rotate 45 degrees
rotated_image = kornia.geometry.transform.rotate(image_tensor, angle)

# Visualize
rotated_image_np = rotated_image.squeeze(0).permute(1, 2, 0).numpy()
plt.imshow(rotated_image_np)
plt.title("Rotated Image")
plt.axis('off')
plt.show()
No description has been provided for this image

Project 1: Real-Time Object Detection with YOLOv4

Goal: Detect objects in real-time using a webcam and YOLOv4.

In [16]:
import cv2
import numpy as np
from google.colab import files

# Load YOLO model
net = cv2.dnn.readNet('yolov3.weights', 'yolov3.cfg')
with open('coco.names', 'r') as f:
    classes = [line.strip() for line in f.readlines()]

layer_names = net.getLayerNames()
output_layers = [layer_names[i - 1] for i in net.getUnconnectedOutLayers().flatten()]


uploaded = files.upload()
video_path = list(uploaded.keys())[0]
cap = cv2.VideoCapture(video_path)
# Get video properties
frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = int(cap.get(cv2.CAP_PROP_FPS))

while True:
    ret, frame = cap.read()
    height, width, channels = frame.shape

    # Preprocess image
    blob = cv2.dnn.blobFromImage(frame, 0.00392, (416, 416), (0, 0, 0), True, crop=False)
    net.setInput(blob)
    outputs = net.forward(output_layers)

    # Parse detections
    for output in outputs:
        for detection in output:
            scores = detection[5:]
            class_id = np.argmax(scores)
            confidence = scores[class_id]
            if confidence > 0.5:
                center_x = int(detection[0] * width)
                center_y = int(detection[1] * height)
                w = int(detection[2] * width)
                h = int(detection[3] * height)

                # Draw bounding box
                x = int(center_x - w / 2)
                y = int(center_y - h / 2)
                cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 0, 0), 2)
                cv2.putText(frame, f"{classes[class_id]} {confidence:.2f}", (x, y - 10),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 2)

    cv2_imshow(frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
Upload widget is only available when the cell has been executed in the current browser session. Please rerun this cell to enable.
Saving Jobs_2.mp4 to Jobs_2.mp4
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
<ipython-input-16-06eb664bd512> in <cell line: 0>()
     27     blob = cv2.dnn.blobFromImage(frame, 0.00392, (416, 416), (0, 0, 0), True, crop=False)
     28     net.setInput(blob)
---> 29     outputs = net.forward(output_layers)
     30 
     31     # Parse detections

KeyboardInterrupt: 

Project 2: Image Classification with Pre-Trained MobileNet

Goal: Classify an image using MobileNet with TensorFlow.

In [17]:
import tensorflow as tf
from tensorflow.keras.applications import MobileNet
from tensorflow.keras.applications.mobilenet import preprocess_input, decode_predictions
from tensorflow.keras.preprocessing.image import load_img, img_to_array

# Load the model
model = MobileNet(weights='imagenet')

# Load and preprocess the image
image = load_img('example.jpg', target_size=(224, 224))
image_array = img_to_array(image)
image_array = preprocess_input(image_array)
image_array = tf.expand_dims(image_array, axis=0)

# Predict the class
predictions = model.predict(image_array)
decoded_predictions = decode_predictions(predictions, top=3)

for idx, (imagenet_id, label, score) in enumerate(decoded_predictions[0]):
    print(f"{idx+1}. {label}: {score*100:.2f}%")
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet/mobilenet_1_0_224_tf.h5
17225924/17225924 ━━━━━━━━━━━━━━━━━━━━ 2s 0us/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 4s 4s/step
Downloading data from https://storage.googleapis.com/download.tensorflow.org/data/imagenet_class_index.json
35363/35363 ━━━━━━━━━━━━━━━━━━━━ 0s 1us/step
1. jean: 28.12%
2. theater_curtain: 21.58%
3. cardigan: 11.64%

Project 3: Real-Time Pose Estimation with Mediapipe

Goal: Create a system to estimate human pose in real-time.

In [18]:
!pip install mediapipe
Collecting mediapipe
  Downloading mediapipe-0.10.20-cp311-cp311-manylinux_2_28_x86_64.whl.metadata (9.7 kB)
Requirement already satisfied: absl-py in /usr/local/lib/python3.11/dist-packages (from mediapipe) (1.4.0)
Requirement already satisfied: attrs>=19.1.0 in /usr/local/lib/python3.11/dist-packages (from mediapipe) (24.3.0)
Requirement already satisfied: flatbuffers>=2.0 in /usr/local/lib/python3.11/dist-packages (from mediapipe) (24.12.23)
Requirement already satisfied: jax in /usr/local/lib/python3.11/dist-packages (from mediapipe) (0.4.33)
Requirement already satisfied: jaxlib in /usr/local/lib/python3.11/dist-packages (from mediapipe) (0.4.33)
Requirement already satisfied: matplotlib in /usr/local/lib/python3.11/dist-packages (from mediapipe) (3.10.0)
Requirement already satisfied: numpy<2 in /usr/local/lib/python3.11/dist-packages (from mediapipe) (1.26.4)
Requirement already satisfied: opencv-contrib-python in /usr/local/lib/python3.11/dist-packages (from mediapipe) (4.10.0.84)
Requirement already satisfied: protobuf<5,>=4.25.3 in /usr/local/lib/python3.11/dist-packages (from mediapipe) (4.25.5)
Collecting sounddevice>=0.4.4 (from mediapipe)
  Downloading sounddevice-0.5.1-py3-none-any.whl.metadata (1.4 kB)
Requirement already satisfied: sentencepiece in /usr/local/lib/python3.11/dist-packages (from mediapipe) (0.2.0)
Requirement already satisfied: CFFI>=1.0 in /usr/local/lib/python3.11/dist-packages (from sounddevice>=0.4.4->mediapipe) (1.17.1)
Requirement already satisfied: ml-dtypes>=0.2.0 in /usr/local/lib/python3.11/dist-packages (from jax->mediapipe) (0.4.1)
Requirement already satisfied: opt-einsum in /usr/local/lib/python3.11/dist-packages (from jax->mediapipe) (3.4.0)
Requirement already satisfied: scipy>=1.10 in /usr/local/lib/python3.11/dist-packages (from jax->mediapipe) (1.13.1)
Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib->mediapipe) (1.3.1)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.11/dist-packages (from matplotlib->mediapipe) (0.12.1)
Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.11/dist-packages (from matplotlib->mediapipe) (4.55.3)
Requirement already satisfied: kiwisolver>=1.3.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib->mediapipe) (1.4.8)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.11/dist-packages (from matplotlib->mediapipe) (24.2)
Requirement already satisfied: pillow>=8 in /usr/local/lib/python3.11/dist-packages (from matplotlib->mediapipe) (11.1.0)
Requirement already satisfied: pyparsing>=2.3.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib->mediapipe) (3.2.1)
Requirement already satisfied: python-dateutil>=2.7 in /usr/local/lib/python3.11/dist-packages (from matplotlib->mediapipe) (2.8.2)
Requirement already satisfied: pycparser in /usr/local/lib/python3.11/dist-packages (from CFFI>=1.0->sounddevice>=0.4.4->mediapipe) (2.22)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.11/dist-packages (from python-dateutil>=2.7->matplotlib->mediapipe) (1.17.0)
Downloading mediapipe-0.10.20-cp311-cp311-manylinux_2_28_x86_64.whl (35.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 35.6/35.6 MB 18.6 MB/s eta 0:00:00
Downloading sounddevice-0.5.1-py3-none-any.whl (32 kB)
Installing collected packages: sounddevice, mediapipe
Successfully installed mediapipe-0.10.20 sounddevice-0.5.1
In [19]:
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose
mp_drawing = mp.solutions.drawing_utils

cap = cv2.VideoCapture(0)

with mp_pose.Pose(min_detection_confidence=0.5, min_tracking_confidence=0.5) as pose:
    while cap.isOpened():
        ret, frame = cap.read()
        frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        results = pose.process(frame_rgb)

        if results.pose_landmarks:
            mp_drawing.draw_landmarks(frame, results.pose_landmarks, mp_pose.POSE_CONNECTIONS)

        cv2.imshow('Pose Estimation', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

cap.release()
cv2.destroyAllWindows()